Prosody Dependent Speech Recognition with Explicit Duration Modelling at Intonational Phrase Boundaries
Abstract
Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which words and phonemes are modelled as dependent on prosody in a way that improves word recognition. The prosodic attribute investigated in this study is the lengthening of speech segments in the vicinity of intonational phrase boundaries. An Explicit Duration Hidden Markov Model (EDHMM) is implemented to provide an accurate phoneme duration model. The study is conducted on the Boston University Radio News Corpus, with prosodic boundaries marked using the ToBI labelling system. We found that lengthening of phrase-final rhymes can be reliably modelled by the EDHMM, which significantly improves prosody-dependent acoustic modelling. Conversely, no systematic duration variation is found in phrase-initial position. With prosody dependence implemented in the acoustic model, pronunciation model and language model, both word recognition accuracy and boundary recognition accuracy improve by 1% over systems without prosody dependence.
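The idea of explicit, prosody-dependent duration modelling can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the add-one smoothing, and the toy frame counts below are all assumptions made for illustration. The sketch contrasts the geometric duration distribution implied by a self-loop in a standard HMM state with an explicit duration table estimated separately for phrase-final and non-final contexts.

```python
import math


def geometric_duration_logprob(d, self_loop):
    """Log-probability that a standard HMM state with the given
    self-loop probability is occupied for exactly d frames."""
    return (d - 1) * math.log(self_loop) + math.log(1.0 - self_loop)


def fit_duration_pmf(durations, max_d, smoothing=1.0):
    """Estimate an explicit duration table over 1..max_d frames.
    Additive smoothing (an assumption here) keeps every duration possible."""
    counts = {d: smoothing for d in range(1, max_d + 1)}
    for d in durations:
        counts[min(d, max_d)] += 1  # clip long segments into the last bin
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()}


def explicit_duration_logprob(d, pmf):
    """Log-probability of a d-frame segment under an explicit duration table."""
    p = pmf.get(d, 0.0)
    return math.log(p) if p > 0 else float("-inf")


# Hypothetical toy data: frame counts for one vowel in two prosodic contexts.
non_final = [5, 6, 6, 7, 7, 7, 8, 8, 9]
phrase_final = [10, 11, 12, 12, 13, 13, 14, 15, 16]

pmf_non_final = fit_duration_pmf(non_final, max_d=20)
pmf_final = fit_duration_pmf(phrase_final, max_d=20)

# Score one observed segment under each prosody-dependent duration model.
observed = 13
score_final = explicit_duration_logprob(observed, pmf_final)
score_non_final = explicit_duration_logprob(observed, pmf_non_final)
print(f"phrase-final: {score_final:.3f}, non-final: {score_non_final:.3f}")
```

In this toy setting a lengthened segment scores higher under the phrase-final table, which is the kind of evidence a prosody-dependent recogniser could combine with acoustic and language model scores. The geometric distribution, by contrast, always peaks at one frame and cannot represent a preferred duration.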
Similar resources
An Overview of Prosodic Modelling for Croatian Speech Synthesis
To include prosody in text-to-speech (TTS) systems, prosody knowledge needs to be acquired, represented and incorporated. The two main features of prosody important for modelling prosody in TTS systems are duration and the F0 contour. There are various approaches to modelling those features, and they can be categorized into three main groups: rule based, statistical and minimalistic. Some...
MeLos: Analysis and Modelling of Speech Prosody and Speaking Style
This thesis addresses the issue of modelling speech prosody for speech synthesis, and presents MeLos: a complete system for the analysis and modelling of speech prosody, “the music of speech”. Research into the analysis and modelling of speech prosody has increased dramatically in recent decades, and speech prosody has emerged as a crucial concern for speech synthesis. The issue of speech prosod...
Statistical Modelling of Speech Segment Duration by Constrained Tree Regression
This paper presents a new method for statistical modelling of prosody control in speech synthesis. The proposed method, referred to as Constrained Tree Regression (CTR), can suitably represent the complex effects of control factors for prosody with a moderate amount of learning data. It is based on recursive splits of predictor variable spaces and partial imposition of constra...
Using prosody to improve Mandarin automatic speech recognition
This paper discusses how to model and train a Mandarin prosody dependent acoustic model and how to decode input speech with a prosody dependent speech recognition system. We use automatic prosody labeling methods to annotate syllable prosodic break type and stress type on a continuous speech corpus, and utilize our proposed methods to train prosody dependent tonal sy...
Publication date: 2003